Minimax Lower Bounds for the Two-Armed Bandit Problem
Abstract
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least 1 − ε times the regret of random guessing, where ε is any small positive constant. This work was supported in part by the National Science Foundation under NYI grant IRI-9457645.
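For reference, the Lai–Robbins asymptotic lower bound cited above is usually stated as follows (standard textbook notation, assumed here rather than taken from this paper): for any consistent allocation rule,

```latex
% Lai-Robbins asymptotic lower bound (standard form; notation assumed)
\liminf_{n \to \infty} \frac{R_n}{\log n}
  \;\ge\; \sum_{j :\, \Delta_j > 0} \frac{\Delta_j}{D(p_j \,\|\, p^{*})},
```

where \(R_n\) is the expected regret at time \(n\), \(\Delta_j\) is the gap between the mean of arm \(j\) and the optimal mean, and \(D(p_j \,\|\, p^{*})\) is the Kullback–Leibler divergence between the reward distribution of arm \(j\) and that of the optimal arm.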
Similar Papers
Finite-time lower bounds for the two-armed bandit problem
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. The finite-time lower bound allows us to derive conditions for the amount of time necessary to make any significant gain over a random guessing strategy. These bounds depend on the class of possible d...
Contributions to the Asymptotic Minimax Theorem for the Two-Armed Bandit Problem
The asymptotic minimax theorem for the Bernoulli two-armed bandit problem states that the minimax risk has the order √N as N → ∞, where N is the control horizon, and provides lower and upper estimates. It can be easily extended to the normal two-armed bandit. For the normal two-armed bandit, we generalize the asymptotic minimax theorem as follows: the minimax risk is approximately equal to 0.637√N as N → ∞. Ke...
The multi-armed bandit problem with covariates
We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization which depends on an observable random covariate. As opposed to the traditional static multi-armed bandit problem, this setting allows for dynamically changing rewards that better describe applications where side information is available. We adopt a nonparametric model where the expected rewa...
Mistake Bounds on Noise-Free Multi-Armed Bandit Game
We study the {0, 1}-loss version of adaptive adversarial multi-armed bandit problems with α (≥ 1) lossless arms. For this problem, we show a tight bound of K − α − Θ(1/T) on the minimax expected number of mistakes (1-losses), where K is the number of arms and T is the number of rounds.
UDC 519.244.3 An Asymptotic Minimax Theorem for Gaussian Two-Armed Bandit
The asymptotic minimax theorem for the Bernoulli two-armed bandit problem states that the minimax risk has the order √N as N → ∞, where N is the control horizon, and provides the estimates of the factor. For the Gaussian two-armed bandit with unit variances of one-step incomes and close expectations, we improve the asymptotic minimax theorem as follows: the minimax risk is approximately equal to 0.637√N as N ...